1 Set Up

Variables to put into model

  • Dose (counts)
  • Dose (mass)
  • Dose (volume)
  • Size (continuous)
  • Size (binned)
  • Polymer type
  • Shape
  • Organism Group (this is being used as a temporary substitute for body size)
  • Life Stage
  • Level of Biological Organization
  • Exposure Duration
  • Acute/Chronic (these values are only complete for fish, molluscs, crustaceans and algae)
  • Charge (negative/positive - categorical)
  • Zeta potential
  • Particle Source (commercial, generated in-house, or modified commercial particles, e.g., by milling)

Response Variables

  • Effect (Y/N) - Binary Data
  • Effect (Y/N) - Binary Data x Effect Score (binned organism level effects)

Continuous Variable Distributions

There are several continuous variables that we want to feed into our model. Before doing so, we check each variable's distribution to see whether it is skewed and needs to be transformed.

Because their distributions are skewed, the following variables are log10 transformed before modeling:

  • Dose (counts)
  • Dose (mass)
  • Dose (volume)
  • Size (continuous)
  • Exposure Duration
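
A minimal sketch of that check and transformation follows, using simulated data. The data frame `aoc_demo` and its column names are hypothetical stand-ins, and the `skewness()` helper is a simple moment-based estimate, not necessarily the diagnostic used in the original analysis.

```r
# Simulated, right-skewed stand-in data (hypothetical; not the real dataset)
set.seed(1)
aoc_demo <- data.frame(
  dose.mg.L = rlnorm(500, meanlog = 0, sdlog = 2),
  size.um = rlnorm(500, meanlog = 2, sdlog = 1.5),
  exposure.duration.d = rlnorm(500, meanlog = 1, sdlog = 1)
)

# Moment-based sample skewness (close to 0 for a symmetric distribution)
skewness <- function(x) {
  x <- x[!is.na(x)]
  mean((x - mean(x))^3) / (mean((x - mean(x))^2))^1.5
}

# Skewness on the raw scale
skew_raw <- sapply(aoc_demo, skewness)

# log10-transform every column; all values must be strictly positive
aoc_log <- as.data.frame(lapply(aoc_demo, log10))
names(aoc_log) <- paste0("log.", names(aoc_demo))

# Skewness after the transformation
skew_log <- sapply(aoc_log, skewness)

print(round(rbind(raw = skew_raw, log10 = skew_log), 2))
```

Zeros or negative values would need separate handling before `log10()`, since the transform is only defined for positive numbers.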

2 Model: Kitchen Sink

All Independent Variables, Response Variable: Effect (Y/N)

Full Model

## $nlevels
##         shape_f           org_f          life_f           bio_f acute.chronic_f 
##               3               3               3               5               2 
## 
## $levels
## $levels$shape_f
## [1] "Fiber"    "Fragment" "Sphere"  
## 
## $levels$org_f
## [1] "Crustacea" "Fish"      "Mollusca" 
## 
## $levels$life_f
## [1] "Adult"    "Early"    "Juvenile"
## 
## $levels$bio_f
## [1] "Cell"       "Organism"   "Population" "Subcell"    "Tissue"    
## 
## $levels$acute.chronic_f
## [1] "Acute"   "Chronic"
## 
## Call:
## lm(formula = effect_10 ~ ., data = aoc_setup_select_1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -0.6570 -0.3312 -0.1912  0.4720  1.0980 
## 
## Coefficients:
##                                         Estimate Std. Error t value Pr(>|t|)
## (Intercept)                              1.00445    0.14865   6.757 1.64e-11
## log.dose.particles.mL.master            -0.11159    0.12526  -0.891 0.373059
## log.dose.mg.L.master                     0.04305    0.00792   5.435 5.84e-08
## log.dose.um3.mL.master                   0.09858    0.12554   0.785 0.432364
## log.size.length.um.used.for.conversions -0.35564    0.37626  -0.945 0.344615
## shape_fFragment                         -0.25198    0.04644  -5.426 6.15e-08
## shape_fSphere                           -0.26325    0.10149  -2.594 0.009533
## org_fFish                               -0.07452    0.02352  -3.169 0.001543
## org_fMollusca                           -0.16489    0.02650  -6.223 5.47e-10
## life_fEarly                             -0.06019    0.02189  -2.750 0.005997
## life_fJuvenile                          -0.11565    0.02356  -4.909 9.57e-07
## bio_fOrganism                           -0.33858    0.04773  -7.094 1.57e-12
## bio_fPopulation                         -0.61578    0.09575  -6.431 1.44e-10
## bio_fSubcell                            -0.12181    0.04525  -2.692 0.007141
## bio_fTissue                             -0.18136    0.05487  -3.305 0.000959
## log.exposure.duration.d                  0.06250    0.01479   4.225 2.45e-05
## acute.chronic_fChronic                   0.04233    0.02363   1.792 0.073272
##                                            
## (Intercept)                             ***
## log.dose.particles.mL.master               
## log.dose.mg.L.master                    ***
## log.dose.um3.mL.master                     
## log.size.length.um.used.for.conversions    
## shape_fFragment                         ***
## shape_fSphere                           ** 
## org_fFish                               ** 
## org_fMollusca                           ***
## life_fEarly                             ** 
## life_fJuvenile                          ***
## bio_fOrganism                           ***
## bio_fPopulation                         ***
## bio_fSubcell                            ** 
## bio_fTissue                             ***
## log.exposure.duration.d                 ***
## acute.chronic_fChronic                  .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4375 on 3499 degrees of freedom
## Multiple R-squared:  0.1078, Adjusted R-squared:  0.1037 
## F-statistic: 26.42 on 16 and 3499 DF,  p-value: < 2.2e-16
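
The call in the output above, `lm(effect_10 ~ ., ...)`, regresses the effect indicator on every other column of the data frame. The sketch below shows the same formula shorthand on simulated data; `demo` and its columns are hypothetical. Assuming `effect_10` is coded 0/1, this is a linear probability model, so fitted values are not confined to [0, 1]. The maximum residual of 1.0980 reported above implies that at least one fitted value is below 0.

```r
# Simulated stand-in data (hypothetical names; not the real dataset)
set.seed(42)
n <- 400
demo <- data.frame(
  log.dose = rnorm(n),
  log.size = rnorm(n),
  shape_f = factor(sample(c("Fiber", "Fragment", "Sphere"), n, replace = TRUE))
)

# Simulate a 0/1 response whose probability rises with dose
p <- plogis(-0.5 + 0.8 * demo$log.dose)
demo$effect_10 <- rbinom(n, size = 1, prob = p)

# "~ ." means "use all remaining columns as predictors"
full_model <- lm(effect_10 ~ ., data = demo)
print(summary(full_model))

# Fitted values of a linear probability model can leave the [0, 1] range
print(range(fitted(full_model)))
```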

Stepwise Model - Both Directions

## 
## Call:
## lm(formula = effect_10 ~ log.dose.particles.mL.master + log.dose.mg.L.master + 
##     log.size.length.um.used.for.conversions + shape_f + org_f + 
##     life_f + bio_f + log.exposure.duration.d + acute.chronic_f, 
##     data = aoc_setup_select_1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -0.6638 -0.3313 -0.1912  0.4728  1.0969 
## 
## Coefficients:
##                                          Estimate Std. Error t value Pr(>|t|)
## (Intercept)                              0.908561   0.084751  10.720  < 2e-16
## log.dose.particles.mL.master            -0.013430   0.007951  -1.689  0.09130
## log.dose.mg.L.master                     0.043341   0.007911   5.478 4.60e-08
## log.size.length.um.used.for.conversions -0.060808   0.024323  -2.500  0.01246
## shape_fFragment                         -0.260118   0.045263  -5.747 9.87e-09
## shape_fSphere                           -0.193184   0.048372  -3.994 6.64e-05
## org_fFish                               -0.074439   0.023514  -3.166  0.00156
## org_fMollusca                           -0.164055   0.026476  -6.196 6.45e-10
## life_fEarly                             -0.058559   0.021791  -2.687  0.00724
## life_fJuvenile                          -0.114927   0.023539  -4.882 1.09e-06
## bio_fOrganism                           -0.338795   0.047723  -7.099 1.51e-12
## bio_fPopulation                         -0.616139   0.095743  -6.435 1.40e-10
## bio_fSubcell                            -0.121917   0.045250  -2.694  0.00709
## bio_fTissue                             -0.181486   0.054869  -3.308  0.00095
## log.exposure.duration.d                  0.062327   0.014790   4.214 2.57e-05
## acute.chronic_fChronic                   0.041786   0.023615   1.770  0.07690
##                                            
## (Intercept)                             ***
## log.dose.particles.mL.master            .  
## log.dose.mg.L.master                    ***
## log.size.length.um.used.for.conversions *  
## shape_fFragment                         ***
## shape_fSphere                           ***
## org_fFish                               ** 
## org_fMollusca                           ***
## life_fEarly                             ** 
## life_fJuvenile                          ***
## bio_fOrganism                           ***
## bio_fPopulation                         ***
## bio_fSubcell                            ** 
## bio_fTissue                             ***
## log.exposure.duration.d                 ***
## acute.chronic_fChronic                  .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4374 on 3500 degrees of freedom
## Multiple R-squared:  0.1076, Adjusted R-squared:  0.1038 
## F-statistic: 28.15 on 15 and 3500 DF,  p-value: < 2.2e-16
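
A stepwise search in both directions can be run with base R's `step()`, which drops or re-adds terms to reduce AIC. The sketch below is an assumption about how a model like the one above could be produced, not the original code; the data are simulated and all names are hypothetical.

```r
# Simulated data: one informative predictor and two pure-noise predictors
set.seed(7)
n <- 500
demo <- data.frame(
  x1 = rnorm(n),
  noise1 = rnorm(n),
  noise2 = rnorm(n)
)
demo$y <- rbinom(n, 1, plogis(1.5 * demo$x1))

# Start from the full model containing every predictor
full_model <- lm(y ~ ., data = demo)

# direction = "both": at each step, consider dropping a term
# or re-adding a previously dropped one; trace = 0 silences the log
step_model <- step(full_model, direction = "both", trace = 0)

print(formula(step_model))
```

In the output above, the selected model differs from the full model only in that `log.dose.um3.mL.master` has been dropped.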

3 Model: Crustacean Fitness, Organism Level Endpoints Only

3.1 Dose (mass)

Full Model

## 
## Call:
## glm(formula = effect_10 ~ logdose.mg.L.master * size.length.um.used.for.conversions, 
##     family = "binomial", data = m1_crust, na.action = "na.exclude")
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.8447  -0.7063  -0.6178  -0.5142   2.0485  
## 
## Coefficients:
##                                                           Estimate Std. Error
## (Intercept)                                             -1.425e+00  7.388e-02
## logdose.mg.L.master                                      1.807e-01  4.837e-02
## size.length.um.used.for.conversions                     -3.807e-04  1.785e-04
## logdose.mg.L.master:size.length.um.used.for.conversions  1.057e-04  4.555e-05
##                                                         z value Pr(>|z|)    
## (Intercept)                                             -19.288  < 2e-16 ***
## logdose.mg.L.master                                       3.737 0.000187 ***
## size.length.um.used.for.conversions                      -2.133 0.032944 *  
## logdose.mg.L.master:size.length.um.used.for.conversions   2.320 0.020367 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 1311.5  on 1310  degrees of freedom
## Residual deviance: 1272.9  on 1307  degrees of freedom
## AIC: 1280.9
## 
## Number of Fisher Scoring iterations: 4
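
The model above is a logistic regression with a dose-by-size interaction. The sketch below fits the same model structure to simulated data and converts the coefficients to odds ratios; the data frame `crust_demo` and its columns are hypothetical.

```r
# Simulated stand-in data (hypothetical; not the real crustacean subset)
set.seed(123)
n <- 1000
crust_demo <- data.frame(
  logdose = rnorm(n, mean = 0, sd = 1.5),
  size.um = runif(n, min = 1, max = 1000)
)
eta <- -1.4 + 0.3 * crust_demo$logdose
crust_demo$effect_10 <- rbinom(n, 1, plogis(eta))

# a * b expands to a + b + a:b (both main effects plus their interaction)
m_demo <- glm(effect_10 ~ logdose * size.um,
              family = binomial, data = crust_demo, na.action = na.exclude)

# Coefficients are on the log-odds scale; exponentiate for odds ratios
odds_ratios <- exp(coef(m_demo))
print(round(odds_ratios, 4))

# With an interaction, the dose slope depends on particle size
b <- coef(m_demo)
dose_slope_at <- function(size) b[["logdose"]] + b[["logdose:size.um"]] * size
print(dose_slope_at(c(10, 100, 1000)))
```

Applied to the estimates reported above, the dose slope at a particle length of 100 µm is 0.1807 + 100 × 0.0001057 ≈ 0.191 on the log-odds scale, which corresponds to an odds ratio of about 1.21 per tenfold increase in mass dose.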

3.2 Dose (volume)

Full Model

## 
## Call:
## glm(formula = effect_10 ~ logdose.um3.mL.master * size.length.um.used.for.conversions, 
##     family = "binomial", data = m1_crust, na.action = "na.exclude")
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.8116  -0.6866  -0.6282  -0.5292   2.0869  
## 
## Coefficients:
##                                                             Estimate Std. Error
## (Intercept)                                               -2.176e+00  2.574e-01
## logdose.um3.mL.master                                      1.262e-01  4.049e-02
## size.length.um.used.for.conversions                       -1.176e-03  4.464e-04
## logdose.um3.mL.master:size.length.um.used.for.conversions  1.259e-04  4.533e-05
##                                                           z value Pr(>|z|)    
## (Intercept)                                                -8.451  < 2e-16 ***
## logdose.um3.mL.master                                       3.118  0.00182 ** 
## size.length.um.used.for.conversions                        -2.635  0.00842 ** 
## logdose.um3.mL.master:size.length.um.used.for.conversions   2.777  0.00549 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 1311.5  on 1310  degrees of freedom
## Residual deviance: 1277.6  on 1307  degrees of freedom
## AIC: 1285.6
## 
## Number of Fisher Scoring iterations: 4
## 
## Call:
## glm(formula = effect_10 ~ logdose.particles.mL.master * size.length.um.used.for.conversions, 
##     family = "binomial", data = m1_crust_part, na.action = "na.exclude")
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.5942  -0.6724  -0.6214  -0.5551   2.0530  
## 
## Coefficients:
##                                                                   Estimate
## (Intercept)                                                     -1.808e+00
## logdose.particles.mL.master                                      7.178e-02
## size.length.um.used.for.conversions                              2.881e-04
## logdose.particles.mL.master:size.length.um.used.for.conversions  1.331e-04
##                                                                 Std. Error
## (Intercept)                                                      1.355e-01
## logdose.particles.mL.master                                      2.177e-02
## size.length.um.used.for.conversions                              6.364e-05
## logdose.particles.mL.master:size.length.um.used.for.conversions  4.499e-05
##                                                                 z value
## (Intercept)                                                     -13.342
## logdose.particles.mL.master                                       3.297
## size.length.um.used.for.conversions                               4.528
## logdose.particles.mL.master:size.length.um.used.for.conversions   2.959
##                                                                 Pr(>|z|)    
## (Intercept)                                                      < 2e-16 ***
## logdose.particles.mL.master                                     0.000978 ***
## size.length.um.used.for.conversions                             5.96e-06 ***
## logdose.particles.mL.master:size.length.um.used.for.conversions 0.003090 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 1311.5  on 1310  degrees of freedom
## Residual deviance: 1280.0  on 1307  degrees of freedom
## AIC: 1288
## 
## Number of Fisher Scoring iterations: 4

plot

Volume

## 
## Call:
## glm(formula = effect_10 ~ logdose.um3.mL.master * size.length.um.used.for.conversions, 
##     family = "binomial", data = m1_crust_volume, na.action = "na.exclude")
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.8116  -0.6866  -0.6282  -0.5292   2.0869  
## 
## Coefficients:
##                                                             Estimate Std. Error
## (Intercept)                                               -2.176e+00  2.574e-01
## logdose.um3.mL.master                                      1.262e-01  4.049e-02
## size.length.um.used.for.conversions                       -1.176e-03  4.464e-04
## logdose.um3.mL.master:size.length.um.used.for.conversions  1.259e-04  4.533e-05
##                                                           z value Pr(>|z|)    
## (Intercept)                                                -8.451  < 2e-16 ***
## logdose.um3.mL.master                                       3.118  0.00182 ** 
## size.length.um.used.for.conversions                        -2.635  0.00842 ** 
## logdose.um3.mL.master:size.length.um.used.for.conversions   2.777  0.00549 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 1311.5  on 1310  degrees of freedom
## Residual deviance: 1277.6  on 1307  degrees of freedom
## AIC: 1285.6
## 
## Number of Fisher Scoring iterations: 4

plot

3.3 Dose (count)

Full Model

## 
## Call:
## glm(formula = effect_10 ~ logdose.particles.mL.master * size.length.um.used.for.conversions, 
##     family = "binomial", data = m1_crust, na.action = "na.exclude")
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.5942  -0.6724  -0.6214  -0.5551   2.0530  
## 
## Coefficients:
##                                                                   Estimate
## (Intercept)                                                     -1.808e+00
## logdose.particles.mL.master                                      7.178e-02
## size.length.um.used.for.conversions                              2.881e-04
## logdose.particles.mL.master:size.length.um.used.for.conversions  1.331e-04
##                                                                 Std. Error
## (Intercept)                                                      1.355e-01
## logdose.particles.mL.master                                      2.177e-02
## size.length.um.used.for.conversions                              6.364e-05
## logdose.particles.mL.master:size.length.um.used.for.conversions  4.499e-05
##                                                                 z value
## (Intercept)                                                     -13.342
## logdose.particles.mL.master                                       3.297
## size.length.um.used.for.conversions                               4.528
## logdose.particles.mL.master:size.length.um.used.for.conversions   2.959
##                                                                 Pr(>|z|)    
## (Intercept)                                                      < 2e-16 ***
## logdose.particles.mL.master                                     0.000978 ***
## size.length.um.used.for.conversions                             5.96e-06 ***
## logdose.particles.mL.master:size.length.um.used.for.conversions 0.003090 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 1311.5  on 1310  degrees of freedom
## Residual deviance: 1280.0  on 1307  degrees of freedom
## AIC: 1288
## 
## Number of Fisher Scoring iterations: 4

3.4 Discrete predictor and dose

## 
## Call:
## glm(formula = effect_10 ~ logdose.mg.L.master * size_f, family = "binomial", 
##     data = m1_crust, na.action = "na.exclude")
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.7085  -0.7062  -0.6167  -0.4255   2.3881  
## 
## Coefficients:
##                                       Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                           -1.57691    0.21094  -7.476 7.67e-14 ***
## logdose.mg.L.master                    0.40501    0.16344   2.478   0.0132 *  
## size_f100nm < 1µm                     -0.01864    0.37454  -0.050   0.9603    
## size_f1µm < 100µm                      0.17395    0.22858   0.761   0.4467    
## size_f100µm < 1mm                     -0.25876    0.35626  -0.726   0.4676    
## size_f1mm < 5mm                       -0.87511    0.64223  -1.363   0.1730    
## logdose.mg.L.master:size_f100nm < 1µm  0.45050    0.31359   1.437   0.1508    
## logdose.mg.L.master:size_f1µm < 100µm -0.24482    0.17319  -1.414   0.1575    
## logdose.mg.L.master:size_f100µm < 1mm -0.23277    0.22994  -1.012   0.3114    
## logdose.mg.L.master:size_f1mm < 5mm    0.04129    0.21904   0.189   0.8505    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 1358.5  on 1347  degrees of freedom
## Residual deviance: 1303.8  on 1338  degrees of freedom
## AIC: 1323.8
## 
## Number of Fisher Scoring iterations: 5
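
The `size_f` terms above are dummy variables for size bins, each compared against a reference bin that does not appear in the coefficient table. The sketch below shows one way such a factor could be built with `cut()` and used in the same model structure. The break points are inferred from the level labels, the reference bin label (`1nm < 100nm`) is an assumption because it is not shown in the output, and the labels use `um` in place of `µm` to keep the code ASCII-only.

```r
# Simulated particle lengths from 1 nm to 5 mm, expressed in micrometres
set.seed(99)
n <- 1500
size.um <- 10^runif(n, min = -3, max = log10(5000))

# Bin the continuous size into ordered categories
size_f <- cut(
  size.um,
  breaks = c(0.001, 0.1, 1, 100, 1000, 5000),
  labels = c("1nm < 100nm", "100nm < 1um", "1um < 100um",
             "100um < 1mm", "1mm < 5mm"),
  right = FALSE,          # intervals are [lower, upper)
  include.lowest = TRUE   # keep a value exactly equal to the top break
)

# Simulated dose and binary response (hypothetical effect sizes)
logdose <- rnorm(n)
effect_10 <- rbinom(n, 1, plogis(-1.5 + 0.4 * logdose))
bin_demo <- data.frame(effect_10, logdose, size_f)

# Dose slope and intercept are allowed to differ by size bin
m_bin <- glm(effect_10 ~ logdose * size_f, family = binomial, data = bin_demo)

# 1 intercept + 1 dose slope + 4 bin offsets + 4 bin-specific slope offsets
print(length(coef(m_bin)))
```

The first factor level acts as the reference, so the `logdose` coefficient is the dose slope within that bin and each interaction term is the difference in slope for another bin.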

Plotted below.

Acute only crustacea

## 
## Call:
## glm(formula = effect_10 ~ logdose.mg.L.master * size_f + logdose.particles.mL.master * 
##     size_f, family = "binomial", data = m1_crust_acute, na.action = "na.exclude")
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.4072  -0.8096  -0.6639  -0.2631   2.1764  
## 
## Coefficients:
##                                               Estimate Std. Error z value
## (Intercept)                                    -30.504      9.991  -3.053
## logdose.mg.L.master                             -2.914      1.075  -2.711
## size_f100nm < 1µm                               29.153     10.270   2.839
## size_f1µm < 100µm                               28.853      9.996   2.886
## size_f100µm < 1mm                               28.939     10.003   2.893
## logdose.particles.mL.master                      2.912      1.001   2.908
## logdose.mg.L.master:size_f100nm < 1µm            3.637      1.165   3.123
## logdose.mg.L.master:size_f1µm < 100µm            3.002      1.078   2.785
## logdose.mg.L.master:size_f100µm < 1mm            2.298      1.148   2.002
## size_f100nm < 1µm:logdose.particles.mL.master   -2.905      1.047  -2.773
## size_f1µm < 100µm:logdose.particles.mL.master   -2.767      1.004  -2.756
## size_f100µm < 1mm:logdose.particles.mL.master   -1.988      1.153  -1.725
##                                               Pr(>|z|)   
## (Intercept)                                    0.00227 **
## logdose.mg.L.master                            0.00672 **
## size_f100nm < 1µm                              0.00453 **
## size_f1µm < 100µm                              0.00390 **
## size_f100µm < 1mm                              0.00382 **
## logdose.particles.mL.master                    0.00364 **
## logdose.mg.L.master:size_f100nm < 1µm          0.00179 **
## logdose.mg.L.master:size_f1µm < 100µm          0.00535 **
## logdose.mg.L.master:size_f100µm < 1mm          0.04526 * 
## size_f100nm < 1µm:logdose.particles.mL.master  0.00555 **
## size_f1µm < 100µm:logdose.particles.mL.master  0.00585 **
## size_f100µm < 1mm:logdose.particles.mL.master  0.08459 . 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 945.92  on 853  degrees of freedom
## Residual deviance: 904.63  on 842  degrees of freedom
## AIC: 928.63
## 
## Number of Fisher Scoring iterations: 5

The above glm is visualized below.

3.4.0.1 Alternative Method Using ggplot

This approach produces a result similar to the ggPredict() function, except that it relies on ggplot(), which is more malleable and transparent. The general steps are: first, create a new data frame spanning 1000 values of size using expand.grid(); then generate predictions with predict(); finally, plot them using geom_line() with colour = size.
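
The steps above can be sketched as follows on simulated data. All object and column names are hypothetical. The grid here crosses a 1000-point dose sequence with a handful of illustrative sizes, which is one reading of the description and not necessarily the original layout. The plotting step is skipped if ggplot2 is not installed.

```r
# Simulated stand-in data (hypothetical; not the real crustacean subset)
set.seed(2024)
n <- 800
d <- data.frame(
  logdose = rnorm(n, sd = 1.5),
  size.um = runif(n, 1, 1000)
)
d$effect_10 <- rbinom(n, 1, plogis(-1.4 + 0.3 * d$logdose + 0.0005 * d$size.um))

fit <- glm(effect_10 ~ logdose * size.um, family = binomial, data = d)

# Prediction grid: every combination of dose and a few illustrative sizes
grid <- expand.grid(
  logdose = seq(min(d$logdose), max(d$logdose), length.out = 1000),
  size.um = c(10, 100, 1000)
)

# type = "response" returns predicted probabilities rather than log-odds
grid$prob <- predict(fit, newdata = grid, type = "response")

# One line per size, coloured by size
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(grid, aes(x = logdose, y = prob,
                        colour = factor(size.um), group = size.um)) +
    geom_line() +
    labs(x = "log10 dose", y = "Predicted probability of effect",
         colour = "Size (um)")
  print(p)
}
```

Building the grid explicitly makes it clear which predictor values each line represents, which is the transparency advantage mentioned above.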

3.5 Survival package

3.5.1 Cumulative Hazard by Time

Probability of Survival by Time